02. Data Analyst Project Overview
Project Overview
In this project, you will analyze a dataset and then communicate your findings about it. You will use the Python libraries NumPy, Pandas, and Matplotlib to make your analysis easier.
What do I need to install?
You will need an installation of Python, plus the following libraries:
* pandas
* numpy
* matplotlib
* csv (part of the Python standard library, so it needs no separate installation)
We recommend installing Anaconda, which comes with all of the necessary packages as well as Jupyter Notebook.
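To confirm that your environment is ready, a quick check along the following lines may help. This is a minimal sketch, not part of the official project materials; the helper name `missing_packages` is made up for illustration. It only looks up whether each library can be found, without importing it.

```python
import importlib.util

# Libraries named in the list above; csv ships with Python itself.
REQUIRED = ["pandas", "numpy", "matplotlib", "csv"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment.

    Hypothetical helper for illustration only.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All required libraries are available.")
```

If anything is reported as missing, installing Anaconda (or installing that package into your current environment) should resolve it.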
Why this Project?
This project will introduce you to the data analysis process that you will be using throughout the rest of the Nanodegree program. In this project, you will go through the entire process so that you know how all the pieces fit together. Later Nanodegree projects will focus on individual pieces of the data analysis process. In this project, you will also gain experience using the Python libraries NumPy, Pandas, and Matplotlib, which make writing data analysis code in Python a lot easier!
What will I learn?
After completing the project, you will:
- Know all the steps involved in a typical data analysis process
- Be comfortable posing questions that can be answered with a given dataset and then answering those questions
- Know how to investigate problems in a dataset and wrangle the data into a format you can use
- Have practice communicating the results of your analysis
- Be able to use vectorized operations in NumPy and Pandas to speed up your data analysis code
- Be familiar with Pandas' Series and DataFrame objects, which let you access your data more conveniently
- Know how to use Matplotlib to produce plots showing your findings
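As a rough preview of what "vectorized operations" and the Series and DataFrame objects in the list above look like in practice, here is a small sketch. The values and the column names (`price`, `quantity`, `total`) are made up for illustration and do not come from any project dataset.

```python
import numpy as np
import pandas as pd

# Made-up values for illustration only; not taken from any course dataset.
prices = np.array([10.0, 20.0, 30.0])
quantities = np.array([1, 2, 3])

# Loop version: multiplies one pair of elements at a time.
totals_loop = [p * q for p, q in zip(prices, quantities)]

# Vectorized version: multiplies the whole arrays in a single operation.
totals_vec = prices * quantities

# A DataFrame holds labeled columns; selecting one column gives a Series.
df = pd.DataFrame({"price": prices, "quantity": quantities})

# Column arithmetic in pandas is vectorized in the same way.
df["total"] = df["price"] * df["quantity"]

print(totals_vec)
print(df)
```

Both versions compute the same totals. The vectorized form is shorter and, on large arrays, typically much faster than an explicit Python loop.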
Why is this Important to my Career?
This project will demonstrate a variety of data analysis skills and show potential employers that you can carry out the entire data analysis process.